A new DNA alignment method based on inverted index
Authors
Abstract
[Abstract] This paper presents a novel DNA sequence alignment method based on the inverted index. Most large-scale information retrieval systems now use the inverted index as their basic data structure, but its application to DNA sequence alignment has not yet been reported. This paper discusses such an application. Three main problems are detailed in turn: DNA segmentation, long DNA query search, and the DNA search ranking algorithm together with its evaluation method. This research presents a new avenue for building more effective DNA alignment methods. [Keywords] DNA search engine, BLAST, inverted index
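To make the idea concrete, the following is a minimal sketch of an inverted index over DNA, not the paper's actual method. It assumes that sequences are segmented into overlapping fixed-length words (k-mers) and that candidates are ranked by a simple count of shared index entries. The abstract does not specify the segmentation or ranking scheme, so both choices, as well as the names `kmers`, `build_index`, and `search`, are illustrative assumptions.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield (offset, word) for every overlapping window of length k."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def build_index(sequences, k):
    """Map each k-mer to the list of (sequence id, offset) where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for offset, word in kmers(seq, k):
            index[word].append((seq_id, offset))
    return index


def search(index, query, k):
    """Rank sequences by how many index postings match the query's k-mers.

    Assumption: plain hit counting stands in for a real ranking function.
    """
    hits = Counter()
    for _, word in kmers(query, k):
        for seq_id, _ in index.get(word, ()):
            hits[seq_id] += 1
    return hits.most_common()


# Toy database; the sequences are made up for illustration.
database = {
    "seq1": "ACGTACGTGA",
    "seq2": "TTGACCGTAA",
}
index = build_index(database, k=4)
ranking = search(index, "ACGTGA", k=4)
print(ranking)
```

A long query is handled the same way as a short one here, since it is simply broken into more k-mers. A practical system would also need to check that matching offsets are consistent before reporting an alignment.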
Similar resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
E-Bayesian Estimations of Reliability and Hazard Rate based on Generalized Inverted Exponential Distribution and Type II Censoring
Introduction This paper is concerned with using the Maximum Likelihood, Bayes and a new method, E-Bayesian, estimations for computing estimates for the unknown parameter, reliability and hazard rate functions of the Generalized Inverted Exponential distribution. The estimates are derived based on a conjugate prior for the unknown parameter. E-Bayesian estimations are obtained based on th...
Strategies for Large-Scale Entity Resolution Based on Inverted Index Data Partitioning
Inverted indexing is a commonly used technique for improving the performance of entity resolution algorithms by reducing the number of pair-wise comparisons necessary to arrive at acceptable results. This chapter describes how inverted indexing can also be used as a data partitioning strategy to perform entity resolution on large datasets in a distributed processing environment. This chapter di...
LAF: a new XML encoding and indexing strategy for keyword-based XML search
As a large number of corpuses are represented, stored and published in XML format, how to find useful information from XML databases has become an increasingly important issue. Keyword search enables web users to easily access XML data without the need to learn a structured query language or to study complex data schemas. Most existing indexing strategies for XML keyword search are based upon D...
Genomic Classification Using an Information-Based Similarity Index: Application to the SARS Coronavirus
Measures of genetic distance based on alignment methods are confined to studying sequences that are conserved and identifiable in all organisms under study. A number of alignment-free techniques based on either statistical linguistics or information theory have been developed to overcome the limitations of alignment methods. We present a novel alignment-free approach to measuring the similarity...
Journal title:
- CoRR
Volume: abs/1307.0194
Publication date: 2013